Skip to content

Implement proper behaviour of recording/replaying RDMA_VERBS_IOCTL#2

Draft
sodar wants to merge 5 commits intorr-with-dpdkfrom
rr-with-dpdk-fix-rdma-verbs-ioctl
Draft

Implement proper behaviour of recording/replaying RDMA_VERBS_IOCTL#2
sodar wants to merge 5 commits intorr-with-dpdkfrom
rr-with-dpdk-fix-rdma-verbs-ioctl

Conversation

@sodar
Copy link
Owner

@sodar sodar commented Sep 4, 2020

TODO:

  • proper logging of RDMA_VERBS_IOCTL ioctl;
  • handle getting value in & out (how?);

@sodar sodar self-assigned this Sep 4, 2020
@sodar
Copy link
Owner Author

sodar commented Sep 4, 2020

Notes from debugging failing ioctl (which causes failure to execute an expected mmap, which was recorded):

(gdb) bt
#0  0x00007ffff687987b in ioctl () from /lib64/libc.so.6
#1  0x000000000678eef2 in execute_ioctl (context=0x75fcd00, cmd=0x7fffffffdc10) at /home/ds/dev/rdma-core/libibverbs/cmd_ioctl.c:147
#2  0x000000000678ffd2 in ioctl_write (ctx=0x75fcd00, write_method=0, req=0x7fffffffe108, core_req_size=8, req_size=40,
    resp=0x7fffffffe0b0, core_resp_size=8, resp_size=80) at /home/ds/dev/rdma-core/libibverbs/cmd_fallback.c:230
#3  0x000000000679004f in _execute_cmd_write (ctx=0x75fcd00, write_method=0, req=0x7fffffffe100, core_req_size=16, req_size=48,
    resp=0x7fffffffe0b0, core_resp_size=8, resp_size=80) at /home/ds/dev/rdma-core/libibverbs/cmd_fallback.c:241
#4  0x000000000678868f in cmd_get_context (context_ex=0x75fcbc0, link=0x7fffffffdf60) at /home/ds/dev/rdma-core/libibverbs/cmd_device.c:145
#5  0x0000000006788915 in ibv_cmd_get_context (context_ex=0x75fcbc0, cmd=0x7fffffffe100, cmd_size=48, resp=0x7fffffffe0b0, resp_size=80)
    at /home/ds/dev/rdma-core/libibverbs/cmd_device.c:176
#6  0x000000000677c7b9 in mlx5_cmd_get_context (context=0x75fcbc0, req=0x7fffffffe100, req_len=48, resp=0x7fffffffe0b0, resp_len=80)
    at /home/ds/dev/rdma-core/providers/mlx5/mlx5.c:611
#7  0x000000000677f032 in mlx5_alloc_context (ibdev=0x75c80e0, cmd_fd=79, private_data=0x7fffffffe1d0)
    at /home/ds/dev/rdma-core/providers/mlx5/mlx5.c:1595
#8  0x000000000679adaf in verbs_open_device (device=0x75c80e0, private_data=0x7fffffffe1d0)
    at /home/ds/dev/rdma-core/libibverbs/device.c:362
#9  0x000000000677dfae in mlx5dv_open_device (device=0x75c80e0, attr=0x7fffffffe1d0) at /home/ds/dev/rdma-core/providers/mlx5/mlx5.c:1302
#10 0x000000000049ed0f in mlx5_glue_dv_open_device (device=0x75c80e0) at ../dpdk/drivers/common/mlx5/linux/mlx5_glue.c:844
#11 0x000000000419b8bc in mlx5_os_open_device (spawn=0x1003e9d40, config=0x7fffffffe590, sh=0x1003d5c80)
    at ../dpdk/drivers/net/mlx5/linux/mlx5_os.c:2034
#12 0x0000000001d1fbcd in mlx5_alloc_shared_dev_ctx (spawn=0x1003e9d40, config=0x7fffffffe590) at ../dpdk/drivers/net/mlx5/mlx5.c:898
#13 0x0000000004197cd6 in mlx5_dev_spawn (dpdk_dev=0x75c87b0, spawn=0x1003e9d40, config=0x7fffffffe590)
    at ../dpdk/drivers/net/mlx5/linux/mlx5_os.c:627
#14 0x000000000419b3cf in mlx5_os_pci_probe (pci_drv=0x6e55f40 <mlx5_pci_driver>, pci_dev=0x75c87a0)
    at ../dpdk/drivers/net/mlx5/linux/mlx5_os.c:1881
#15 0x0000000000499bfa in drivers_probe (dev=0x1003e9ec0, pci_drv=0x6e55f40 <mlx5_pci_driver>, pci_dev=0x75c87a0, user_classes=1)
    at ../dpdk/drivers/common/mlx5/mlx5_common_pci.c:244
#16 0x0000000000499de5 in mlx5_common_pci_probe (pci_drv=0x6e55f40 <mlx5_pci_driver>, pci_dev=0x75c87a0)
    at ../dpdk/drivers/common/mlx5/mlx5_common_pci.c:306
#17 0x00000000004827ae in rte_pci_probe_one_driver (dr=0x6e55f40 <mlx5_pci_driver>, dev=0x75c87a0)
    at ../dpdk/drivers/bus/pci/pci_common.c:204
#18 0x00000000004829a5 in pci_probe_all_drivers (dev=0x75c87a0) at ../dpdk/drivers/bus/pci/pci_common.c:279
#19 0x0000000000482a2b in pci_probe () at ../dpdk/drivers/bus/pci/pci_common.c:306
#20 0x00000000066c15bd in rte_bus_probe () at ../dpdk/lib/librte_eal/common/eal_common_bus.c:72
#21 0x00000000066fd729 in rte_eal_init (argc=7, argv=0x7fffffffeae8) at ../dpdk/lib/librte_eal/linux/eal.c:1280
#22 0x0000000000407798 in main (argc=7, argv=0x7fffffffeae8) at main.c:11

(gdb) info registers
rax            0xffffffffffffffda  -38
rbx            0x7fffffffdcc0      140737488346304
rcx            0x7ffff687987b      140737329469563
rdx            0x7fffffffdc30      140737488346160
rsi            0xc0181b01          3222805249
rdi            0x4f                79
rbp            0x7fffffffdc00      0x7fffffffdc00
rsp            0x7fffffffdbd8      0x7fffffffdbd8
r8             0x0                 0
r9             0x7fffffffe0b0      140737488347312
r10            0x3                 3
r11            0x3206              12806
r12            0x3                 3
r13            0x0                 0
r14            0x3                 3
r15            0x0                 0
rip            0x7ffff687987b      0x7ffff687987b <ioctl+11>
eflags         0x3206              [ PF IF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0
gs             0x0                 0
(gdb) ni
0x00007ffff687987b in ioctl () from /lib64/libc.so.6
(gdb) info registers
rax            0x0                 0
rbx            0x7fffffffdcc0      140737488346304
rcx            0x7ffff687987b      140737329469563
rdx            0x7fffffffdc30      140737488346160
rsi            0xc0181b01          3222805249
rdi            0x4f                79
rbp            0x7fffffffdc00      0x7fffffffdc00
rsp            0x7fffffffdbd8      0x7fffffffdbd8
r8             0x0                 0
r9             0x7fffffffe0b0      140737488347312
r10            0x3                 3
r11            0x3206              12806
r12            0x3                 3
r13            0x0                 0
r14            0x3                 3
r15            0x0                 0
rip            0x7ffff687987b      0x7ffff687987b <ioctl+11>
eflags         0x3206              [ PF IF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0
gs             0x0                 0

(gdb) frame 1
#1  0x000000000678eef2 in execute_ioctl (context=0x75fcd00, cmd=0x7fffffffdc10) at /home/ds/dev/rdma-core/libibverbs/cmd_ioctl.c:147
147     /home/ds/dev/rdma-core/libibverbs/cmd_ioctl.c: No such file or directory.
(gdb) p cmd->hdr
$11 = {length = 104, object_id = 0, method_id = 0, num_attrs = 5, reserved1 = 0, driver_id = 1, reserved2 = 0, attrs = 0x7fffffffdc48}
(gdb) p *((struct ib_uverbs_attr *)(0x7fffffffdc48 + 0x10 * 0))
$12 = {attr_id = 2, len = 8, flags = 1, attr_data = {enum_data = {elem_id = 0 '\000', reserved = 0 '\000'}, reserved = 0}, {data = 0,
    data_s64 = 0}}
(gdb) p *((struct ib_uverbs_attr *)(0x7fffffffdc48 + 0x10 * 1))
$13 = {attr_id = 0, len = 8, flags = 1, attr_data = {enum_data = {elem_id = 0 '\000', reserved = 0 '\000'}, reserved = 0}, {
    data = 140737488347312, data_s64 = 140737488347312}}
(gdb) p *((struct ib_uverbs_attr *)(0x7fffffffdc48 + 0x10 * 2))
$14 = {attr_id = 1, len = 8, flags = 3, attr_data = {enum_data = {elem_id = 0 '\000', reserved = 0 '\000'}, reserved = 0}, {
    data = 140737488347312, data_s64 = 140737488347312}}
(gdb) p *((struct ib_uverbs_attr *)(0x7fffffffdc48 + 0x10 * 3))
$15 = {attr_id = 4096, len = 32, flags = 1, attr_data = {enum_data = {elem_id = 0 '\000', reserved = 0 '\000'}, reserved = 0}, {
    data = 140737488347408, data_s64 = 140737488347408}}
(gdb) p *((struct ib_uverbs_attr *)(0x7fffffffdc48 + 0x10 * 4))
$16 = {attr_id = 4097, len = 72, flags = 3, attr_data = {enum_data = {elem_id = 0 '\000', reserved = 0 '\000'}, reserved = 0}, {
    data = 140737488347320, data_s64 = 140737488347320}}
  • cmd_get_context fills struct mlx5_alloc_ucontext_resp resp (providers/mlx/mlx5.c), not mlx5_ib_alloc_ucontext_resp;
  • This is field is probably used as a buffer, because after stepping out correct value is set - resp.drv_payload.tot_bfregs = 16;

@sodar
Copy link
Owner Author

sodar commented Sep 4, 2020

Script used to obtain the trace:

#!/bin/bash

set -ex

#MLX5_STALL_CQ_POLL=1 RR_LOG=all:debug RR_LOG_FILE=/root/dpdk-example-20200905T0003.log /opt/dev/rr/bin/rr record -n \

MLX5_STALL_CQ_POLL=1 RR_LOG=all:warn /opt/dev/rr/bin/rr record -n \
    ./dpdk-example \
        -w pci:0000:01:00.0 \
        --log-level lib.eal:debug \
        --log-level pmd.net.mlx5:debug

struct ib_uverbs_ioctl_hdr size is defined dynamically, so on
RDMA_VERBS_IOCTL command before registering IN_OUT parameter
rr has to lookup .length field.
`struct ib_uverbs_ioctl_hdr` as an VLA of `struct ib_uverbs_attr`. Each of this
can hold a pointer to userspace buffer or a piece of data.

Kernel driver distinguish between data/pointer based on internal kernel data,
so here memory buffer is recorded in scratch if:

- data length is longer than 8 (then it is most certainly a buffer);
- or data length is equal to 8 and data addr points to a valid tracee memory
  (however it might not be accurate, data length might be 8 and can be treated as data,
  but store scratch anyway).

On top of that attr struct is always stored in scratch.
@sodar sodar force-pushed the rr-with-dpdk-fix-rdma-verbs-ioctl branch from 7717691 to 87fe972 Compare September 19, 2020 12:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant